We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses for the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation and to take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos-based algorithm for problems of moderate size on a Cray XC30 system.
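To make the idea of a preconditioned block iterative eigensolver concrete, the following is a minimal sketch in Python with NumPy. It is not the solver described in this work: it is a generic block preconditioned steepest descent iteration with a Rayleigh-Ritz step, and every specific in it is an assumption made for illustration, including the function name `block_preconditioned_eigs`, the shifted-diagonal preconditioner, the small dense test matrix, and the starting guesses taken from the lowest diagonal entries.

```python
import numpy as np


def block_preconditioned_eigs(H, X0, shift, tol=1e-8, max_iter=500):
    """Approximate the k lowest eigenpairs of a symmetric matrix H.

    Hypothetical illustration only: block preconditioned steepest descent
    with Rayleigh-Ritz, using the assumed preconditioner 1 / (diag(H) - shift),
    where `shift` lies below the smallest diagonal entry of H.
    """
    k = X0.shape[1]
    X, _ = np.linalg.qr(X0)                 # orthonormalize the starting block
    inv_diag = 1.0 / (np.diag(H) - shift)   # diagonal preconditioner (assumption)

    for it in range(1, max_iter + 1):
        HX = H @ X                          # one block product, not k separate ones
        theta, C = np.linalg.eigh(X.T @ HX) # Rayleigh-Ritz within span(X)
        X, HX = X @ C, HX @ C
        R = HX - X * theta                  # residual of each Ritz pair (column-wise)
        if np.linalg.norm(R, axis=0).max() < tol:
            return theta, X, it

        W = inv_diag[:, None] * R           # preconditioned residual block
        S, _ = np.linalg.qr(np.hstack([X, W]))
        _, V = np.linalg.eigh(S.T @ (H @ S))
        X = S @ V[:, :k]                    # keep the k lowest Ritz vectors

    return np.linalg.eigvalsh(X.T @ H @ X), X, max_iter


# Small synthetic, diagonally dominant test problem (not a nuclear Hamiltonian).
rng = np.random.default_rng(0)
n, k = 200, 4
A = rng.standard_normal((n, n))
H = np.diag(np.arange(1.0, n + 1.0)) + 0.005 * (A + A.T)

# Starting guesses: unit vectors at the k smallest diagonal entries (assumption).
X0 = np.eye(n)[:, np.argsort(np.diag(H))[:k]]

theta, X, iterations = block_preconditioned_eigs(H, X0, shift=np.diag(H).min() - 1.0)
```

The point of the sketch is the structure of the iteration: the matrix is applied to a block of vectors at once, which is what allows a block method to raise arithmetic intensity relative to data movement, and both the starting block and the preconditioner are chosen from cheaply available information about the matrix.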